Optimizing the performance of reactive molecular dynamics simulations for many-core architectures

Related articles

Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-Core Architectures

Reactive molecular dynamics simulations are computationally demanding. Reaching spatial and temporal scales where interesting scientific phenomena can be observed requires efficient and scalable implementations on modern hardware. In this paper, we focus on optimizing the performance of the widely used LAMMPS/ReaxC package for multi-core architectures. As hybrid parallelism allows better levera...


A Cross-Core Performance Model for Heterogeneous Many-Core Architectures

An accurate performance predictor to identify the most suitable core architecture to execute each thread/workload in a heterogeneous many-core structure is proposed. The devised predictor is based on a linear regression model that considers several different parameters of the many-core processor architectures, including the cache size, issue width, re-order buffer size, load/store queue size, e...


Automatically Optimizing Stencil Computations on Many-Core NUMA Architectures

This paper presents a system for automatically supporting the optimization of stencil kernels on emerging Non-Uniform Memory Access (NUMA) many-core architectures, through a combined compiler + runtime approach. In particular, we use a pragma-driven compiler to recognize the special structures and optimization needs of stencil computations and thereby to automatically generate low-level code tha...


Many Core Hardware Architectures


Parallel Dual Tree Traversal on Multi-core and Many-core Architectures for Astrophysical N-body Simulations

In astrophysical N-body simulations, Dehnen’s algorithm, implemented in the serial falcON code and based on a dual tree traversal, is faster than serial Barnes-Hut tree-codes, but is outperformed by parallel CPU and GPU tree-codes. In this paper, we present a parallel dual tree traversal, implemented in the pfalcON code, targeting multi-core CPUs and many-core architectures (Xeon Phi). We focus he...


Journal

Journal title: The International Journal of High Performance Computing Applications

Year: 2018

ISSN: 1094-3420,1741-2846

DOI: 10.1177/1094342017746221